Rank | Count | Beginning |
---|---|---|
26847 | 10648 | Η |
80750 | 9559 | Τὸ |
16846 | 8538 | Ὁ |
51925 | 4270 | Οἳ |
80754 | 2370 | Τα |
6432 | 1454 | Από |
73772 | 1381 | Στην |
76842 | 1228 | Στο |
17950 | 1196 | Είναι |
75640 | 1194 | Στις |
44539 | 1125 | Με |
45530 | 1092 | Μετά |
70549 | 1076 | Σε |
13345 | 1048 | Για |
84081 | 966 | Την |
41201 | 932 | Κατά |
73407 | 821 | Στη |
10264 | 772 | Αυτό |
92068 | 704 | Τον |
36713 | 668 | Ήταν |
25948 | 609 | Έχει |
63013 | 608 | Όταν |
72653 | 603 | Στα |
23384 | 555 | Επίσης |
99274 | 518 | Ωστόσο, |
47946 | 515 | Μια |
4311 | 505 | Αν |
24946 | 500 | Έτσι |
79111 | 487 | Σύμφωνα |
9515 | 441 | Αὐτὴ |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N = 1, 2, 3, 4. This subsection starts with N=1.
The most frequent word N-grams at the beginning of sentences give some insight into sentence composition.
For N=1 in particular, even a small corpus is enough to identify the most frequent sentence beginnings.
select substring_index(sentence, ' ', 1) as beg, count(*) as cnt from sentences group by substring_index(sentence, ' ', 1) order by cnt desc limit 50;
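The same count can be sketched outside the database for any N. The following is a minimal Python illustration, not the code used to produce the table above: the function name `sentence_beginnings`, its parameters, and the sample sentences are assumptions made for this example. It mirrors the query by keeping everything before the N-th space of each sentence, so punctuation attached to a word (as in "Ωστόσο,") stays part of the beginning.

```python
from collections import Counter


def sentence_beginnings(sentences, n=1, limit=50):
    """Return the `limit` most frequent sentence beginnings of n words.

    Hypothetical helper: `sentences` is any iterable of strings, one
    sentence per item, standing in for the `sentences` table.
    """
    counts = Counter()
    for sentence in sentences:
        # Same idea as substring_index(sentence, ' ', n): split on single
        # spaces and keep the first n pieces. A sentence with fewer than
        # n words is counted as a whole.
        beginning = " ".join(sentence.split(" ")[:n])
        counts[beginning] += 1
    # Sorted by descending count, like "order by cnt desc limit 50".
    return counts.most_common(limit)


if __name__ == "__main__":
    # Made-up sample sentences, for illustration only.
    sample = [
        "Η πόλη είναι μεγάλη.",
        "Η χώρα είναι μικρή.",
        "Στο κέντρο υπάρχει πλατεία.",
    ]
    for n in (1, 2):
        print(n, sentence_beginnings(sample, n=n))
```

Splitting on a single space rather than on arbitrary whitespace keeps the behaviour close to the query; a real tokenizer would probably treat punctuation and repeated spaces differently.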